A Comparison on How Statistical Tests Deal with Concept Drifts

Authors

  • Paulo M. Gonçalves
  • Roberto S. M. Barros
Abstract

RCD is a framework proposed to deal with recurring concept drifts. It stores classifiers together with a sample of the data used to train them. When a concept drift occurs, RCD compares each stored sample with a sample of current data to determine whether the context is new or a recurrence of an old one. The comparison uses a non-parametric multivariate statistical test. This paper describes how two statistical tests (KNN and Cramer) can distinguish between new and old contexts. RCD is tested with several base classifiers, in environments with different rate-of-change values, with gradual and abrupt concept drifts. Results show that RCD improves single classifiers' accuracy regardless of the statistical test used.
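
The comparison step described in the abstract can be illustrated with a small sketch. It assumes a nearest-neighbour two-sample test with a permutation-based p-value, which is one common way to build a KNN test and is not necessarily the variant used in the paper. The function names and the parameters `k`, `n_perm`, `alpha`, and `seed` are hypothetical and chosen only for illustration.

```python
import numpy as np


def knn_neighbors(pooled, k):
    """Indices of the k nearest neighbours of every pooled point (self excluded)."""
    diff = pooled[:, None, :] - pooled[None, :, :]
    dist = (diff ** 2).sum(axis=2)       # squared Euclidean distances
    np.fill_diagonal(dist, np.inf)       # a point is not its own neighbour
    return np.argsort(dist, axis=1)[:, :k]


def same_sample_fraction(neighbors, labels):
    """Fraction of neighbours that carry the same sample label as their point."""
    return float(np.mean(labels[neighbors] == labels[:, None]))


def knn_two_sample_pvalue(a, b, k=3, n_perm=200, seed=0):
    """Permutation p-value for the hypothesis that a and b share a distribution.

    A small p-value suggests the two samples come from different contexts.
    """
    rng = np.random.default_rng(seed)
    pooled = np.vstack([a, b]).astype(float)
    labels = np.concatenate(
        [np.zeros(len(a), dtype=int), np.ones(len(b), dtype=int)]
    )
    # Neighbours depend only on the points, so they are computed once.
    neighbors = knn_neighbors(pooled, k)
    observed = same_sample_fraction(neighbors, labels)
    # Shuffle the labels to simulate "both samples come from one distribution".
    exceed = sum(
        same_sample_fraction(neighbors, rng.permutation(labels)) >= observed
        for _ in range(n_perm)
    )
    return (1 + exceed) / (1 + n_perm)


def find_recurring_concept(stored_samples, current, alpha=0.05, **kwargs):
    """Index of the first stored sample the test cannot tell apart from the
    current data, or None when every stored sample looks different (assumed
    here to mean a new context)."""
    for i, sample in enumerate(stored_samples):
        if knn_two_sample_pvalue(sample, current, **kwargs) > alpha:
            return i
    return None
```

In a framework like the one described, a returned index would mean that the classifier stored with that sample can be reused, while `None` would mean a new classifier has to be trained. The permutation approach is used here because it avoids relying on an asymptotic distribution for the statistic, at the cost of extra computation.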

Similar resources

Concept drift detection in event logs using statistical information of variants

In recent years, business process management (BPM) has been highly regarded as an improvement in the efficiency and effectiveness of organizations. Extracting and analyzing information on business processes is an important part of this structure. But these processes are not sustainable over time and may change for a variety of reasons, such as the environment and human resources. These changes ...

Concept Drift Detection with Hierarchical Hypothesis Testing (Proceedings of the 2017 SIAM International Conference on Data Mining, Society for Industrial and Applied Mathematics)

When using statistical models (such as a classifier) in a streaming environment, there is often a need to detect and adapt to concept drifts to mitigate any deterioration in the model’s predictive performance over time. Unfortunately, the ability of popular concept drift approaches in detecting these drifts in the relationship of the response and predictor variable is often dependent on the dis...

Concept Drift Detection and Adaptation with Hierarchical Hypothesis Testing

In a streaming environment, there is often a need for statistical prediction models to detect and adapt to concept drifts (i.e., changes in the underlying relationship between the response and predictor data streams being modeled) so as to mitigate deteriorating predictive performance over time. Various concept drift detection approaches have been proposed in the past decades. However, they do ...

Fast Adapting Ensemble: A New Algorithm for Mining Data Streams with Concept Drift

The treatment of large data streams in the presence of concept drifts is one of the main challenges in the field of data mining, particularly when the algorithms have to deal with concepts that disappear and then reappear. This paper presents a new algorithm, called Fast Adapting Ensemble (FAE), which adapts very quickly to both abrupt and gradual concept drifts, and has been specifically desig...

A Simple Unlearning Framework for Online Learning Under Concept Drifts

Real-world online learning applications often face data coming from changing target functions or distributions. Such changes, called the concept drift, degrade the performance of traditional online learning algorithms. Thus, many existing works focus on detecting concept drift based on statistical evidence. Other works use sliding window or similar mechanisms to select the data that closely ref...

Publication date: 2012